A Novel Method for Disease Prediction: Hybrid of Random Forest and Multivariate Adaptive Regression Splines
Authors
Abstract
The use of data mining technology for disease prediction and diagnosis has become a focus of attention. Data mining provides an important means of extracting valuable medical rules hidden in medical data and plays an important role in disease prediction and clinical diagnosis. This paper surveys several popular data mining techniques for disease prediction and diagnosis, such as decision trees, association rule analysis, and clustering analysis. It then proposes a novel hybrid method that combines random forest and multivariate adaptive regression splines (MARS) to build a disease prediction model. First, the random forest algorithm performs a preliminary screening of the variables and produces an importance ranking. Next, the reduced dataset, restricted to the top-k most important predictors, is passed to the MARS procedure, which builds interpretable models for predicting disease survivability. The combined method is evaluated using basic performance measurements (e.g., accuracy, sensitivity, and specificity) together with 10-fold cross-validation. Experimental results show that the proposed method provides higher accuracy and a relatively simple model.
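The building blocks named in the abstract can be sketched in plain Python. This is an illustrative sketch only, not the authors' implementation: the function names, the feature names, and the importance scores are all hypothetical, and the random forest and MARS fitting steps are assumed to come from an external library. It shows top-k selection from an importance ranking, the mirrored hinge functions that MARS uses as basis terms, a 10-fold split, and the three evaluation measures.

```python
from typing import Dict, List, Sequence, Tuple


def top_k_features(importances: Dict[str, float], k: int) -> List[str]:
    """Keep the k predictors with the highest importance score."""
    ranked = sorted(importances, key=lambda name: importances[name], reverse=True)
    return ranked[:k]


def hinge_pair(x: float, knot: float) -> Tuple[float, float]:
    """Mirrored MARS hinge basis values: max(0, x - knot) and max(0, knot - x)."""
    return max(0.0, x - knot), max(0.0, knot - x)


def k_fold_indices(n_samples: int, n_folds: int = 10) -> List[List[int]]:
    """Split range(n_samples) into n_folds contiguous test folds of near-equal size.

    Assumption: the rows were shuffled beforehand; no stratification is done here.
    """
    base, extra = divmod(n_samples, n_folds)
    folds: List[List[int]] = []
    start = 0
    for i in range(n_folds):
        size = base + (1 if i < extra else 0)  # the first `extra` folds get one more row
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity (true positive rate), and specificity (true negative rate)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
    }


if __name__ == "__main__":
    # Hypothetical importance scores, as a random forest might report them.
    scores = {"age": 0.4, "bmi": 0.1, "blood_pressure": 0.3}
    print(top_k_features(scores, k=2))
    print(hinge_pair(3.0, knot=1.0))
    print([len(fold) for fold in k_fold_indices(25)])
    print(binary_metrics([1, 1, 0, 0, 1], [1, 0, 0, 1, 1]))
```

In a full pipeline, each fold from `k_fold_indices` would serve once as the test set while the remaining rows train the random forest and the MARS model, and the per-fold metrics would be averaged.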
Related resources
Seismic Data Forecasting: A Sequence Prediction or a Sequence Recognition Task
In this paper, we attempt to predict earthquake events in a cluster of seismic data from the Pacific Ring of Fire, using multivariate adaptive regression splines (MARS). The model is employed either as a predictor for a sequence prediction task or as a binary classifier for a sequence recognition problem, which could alternatively help to predict an event. Here, we explain that sequence prediction/r...
GENETIC PROGRAMMING AND MULTIVARIATE ADAPTIVE REGRESION SPLINES FOR PRIDICTION OF BRIDGE RISKS AND COMPARISION OF PERFORMANCES
In this paper, two different data-driven models, genetic programming (GP) and multivariate adaptive regression splines (MARS), are adopted to build models for predicting the bridge risk score. The input parameters of bridge risk consist of safe risk rating (SRR), functional risk rating (FRR), sustainability risk rating (SUR), environmental risk rating (ERR) and target output. The total ...
Hybrid Method of Logistic Regression and Data Envelopment Analysis for Event Prediction: A Case Study (Stroke Disease)
Predictive analytics is an area of statistics that deals with extracting information from data and using it to predict trends and behavior patterns. Many mathematical models have been developed and used for prediction, and in some cases they have been found to be very strong and reliable. This paper studies different mathematical and statistical approaches for event prediction. The ...
ESTIMATING DRYING SHRINKAGE OF CONCRETE USING A MULTIVARIATE ADAPTIVE REGRESSION SPLINES APPROACH
In the present study, the multivariate adaptive regression splines (MARS) technique is employed to estimate the drying shrinkage of concrete. For this purpose, a very large database (RILEM Data Bank) drawn from different experimental studies is used. Several effective parameters such as the age of onset of shrinkage measurement, age at start of drying, the ratio of the volume of the sample on its drying...
Retrieval Algorithm for Conical-Scanning Microwave Imagers Aided by Random Forest, RReliefF, and Multivariate Adaptive Regression Splines
This study proposes a rain rate retrieval algorithm for conical-scanning microwave imagers (RAMARS), as an alternative to the NASA Goddard Profiling (GPROF) algorithm, that does not rely on any a priori information. The fundamental basis of RAMARS follows the concept of the GPROF algorithm, that is, being consistent with the TRMM PR rain rate observations but independent of any auxiliar...
Journal title:
- JCP
Volume 8, Issue
Pages -
Publication date 2013